6 research outputs found

    Contract-Based Programming on Modern C++

    Contract-based programming, or Design by Contract (DBC), is a discipline for system construction that in recent years has established itself as one of the most solid and reliable models for software creation. It is well known that a large number of software projects in industry are not completed successfully, and the main cause of failure is that projects do not meet user expectations. In this context, Design by Contract emerges as a way to reduce this failure rate, since it provides a set of mechanisms for validating part of the requirements specification. In recent years, several programming languages have started to implement DBC, either as part of the language or as an external library. The main languages and tools that support contract-based programming are Ada 2012, Spark, Eiffel, D, C# CodeContracts and the Microsoft Source-Code Annotation Language (Microsoft SAL). Traditionally, C++ has been a programming language focused on flexibility, performance and efficiency, which has attracted many people to carry out projects with it. However, trends make programming languages change, and the interests of the industry are leaning towards robust solutions, which must include reliable frameworks. With this purpose, a specification for Design by Contract has been designed for C++. This specification has been accepted by the ISO C++ committee for inclusion in C++20. It includes several clauses that allow the user to write preconditions and postconditions in the code. This allows part of the requirements specification to be merged into the code, enabling traceability between the phases of a software project. Specifying a new feature in a programming language implies changes in how the language is understood by a compiler, and implementing it requires changes at several levels. 
This document describes these changes. Additionally, it provides an overview of the structure of a compiler, and a brief description of all the parts of the Clang C++ compiler.
    Contract-based programming is a system construction discipline that has recently established itself as one of the most solid and reliable for creating software systems. The software development industry is known to fall short of success, partly because of the failure rate of its projects. In this context, contract-based programming emerges as a solution to reduce this failure rate in the industry. This development approach provides users with mechanisms for validating requirements. In recent years, several programming languages have begun to implement contract-based programming, either as part of the language or as an external library. The main programming languages that currently support contract-based programming are Ada 2012, Spark, Eiffel, D, C# CodeContracts and Microsoft Source-Code Annotation Language (Microsoft SAL). Traditionally, C++ has been a programming language focused on giving the user flexibility, performance and efficiency. These characteristics have attracted many users to adopt the language in their projects. However, trends force changes in programming languages, and the interests of companies are currently leaning towards robust solutions. These solutions must include reliable frameworks. With this in mind, a specification for contract-based programming in the language has been designed. This new specification has been accepted by the ISO C++ committee for inclusion in C++20. It provides the user with several mechanisms for checking conditions in the code. This makes it possible to link the requirements specification directly to its implementation. Specifying a new feature in a programming language implies changes in how the language is understood by a compiler. Implementing these new requirements calls for modifications to the compiler at several levels of analysis. This project summarizes the changes needed within a compiler: it first outlines the structure of a compiler, then breaks down the structure of the Clang C++ compiler, and finally describes the modifications to each of the parts involved.
    Ingeniería Informátic
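The idea of attaching preconditions and postconditions to a function can be illustrated with a small, language-neutral sketch. The Python decorator below is only an assumption-laden illustration of the concept: the names `contract` and `square_root` are hypothetical, and the sketch does not reproduce the C++ contract syntax or the Clang changes the thesis describes.

```python
from functools import wraps


def contract(pre=None, post=None):
    """Attach an optional precondition and postcondition to a function.

    `pre` receives the call arguments; `post` receives the result followed
    by the call arguments. Both are hypothetical helpers for illustration.
    """
    def decorate(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            # Check the caller's obligation before running the body.
            if pre is not None and not pre(*args, **kwargs):
                raise AssertionError(f"precondition of {func.__name__} violated")
            result = func(*args, **kwargs)
            # Check the function's guarantee after running the body.
            if post is not None and not post(result, *args, **kwargs):
                raise AssertionError(f"postcondition of {func.__name__} violated")
            return result
        return wrapper
    return decorate


@contract(pre=lambda x: x >= 0, post=lambda r, x: abs(r * r - x) < 1e-9)
def square_root(x):
    return x ** 0.5
```

Calling `square_root(-1.0)` fails at the precondition before the body runs, which is the kind of requirement-to-code traceability the abstract refers to.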

    Towards Improved Homomorphic Encryption for Privacy-Preserving Deep Learning

    International Mention in the doctoral degree
    Deep Learning (DL) has brought a remarkable transformation to many fields, heralded by some as a new technological revolution. The advent of large-scale models has increased the demand for data and computing platforms, for which cloud computing has become the go-to solution. However, the uptake of DL and cloud computing is limited in privacy-enforcing areas that deal with sensitive data. These areas imperatively call for privacy-enhancing technologies that enable responsible, ethical, and privacy-compliant use of data in potentially hostile environments. To this end, the cryptography community has addressed these concerns with what is known as Privacy-Preserving Computation Techniques (PPCTs), a set of tools that enable privacy-enhancing protocols where cleartext access to information is no longer tenable. Of these techniques, Homomorphic Encryption (HE) stands out for its ability to perform operations over encrypted data without compromising data confidentiality or privacy. However, despite its promise, HE is still a relatively nascent solution with efficiency and usability limitations. Improving the efficiency of HE has been a longstanding challenge in the field of cryptography, and with improvements, the complexity of the techniques has increased, especially for non-experts. In this thesis, we address the problem of the complexity of HE when applied to DL. We begin by systematizing existing knowledge in the field through an in-depth analysis of the state of the art in privacy-preserving deep learning, identifying key trends, research gaps, and issues associated with current approaches. One such gap lies in the need for vectorized algorithms when using Packed Homomorphic Encryption (PaHE), a state-of-the-art technique to reduce the overhead of HE in complex areas. 
This thesis comprehensively analyzes existing algorithms and proposes new ones for using DL with PaHE, presenting a formal analysis and usage guidelines for their implementation. Parameter selection of HE schemes is another recurring challenge in the literature, given that it plays a critical role in determining not only the security of the instantiation but also the precision, performance, and degree of security of the scheme. To address this challenge, this thesis proposes a novel system combining fuzzy logic with linear programming tasks to produce secure parametrizations based on high-level user input arguments without requiring low-level knowledge of the underlying primitives. Finally, this thesis describes HEFactory, a symbolic execution compiler designed to streamline the process of producing HE code and integrating it with Python. HEFactory implements the previous proposals presented in this thesis in an easy-to-use tool. It provides a unique architecture that layers the challenges associated with HE and produces simplified operations interpretable by low-level HE libraries. HEFactory significantly reduces the overall complexity of coding DL applications with HE, resulting in an 80% length reduction from expert-written code while maintaining equivalent accuracy and efficiency.
    Deep learning has brought a remarkable transformation to many fields, which some have described as a new technological revolution. The emergence of massive models has increased the demand for data and computing platforms, for which cloud computing has become the go-to solution. However, the uptake of deep learning and cloud computing is limited in privacy-sensitive domains that handle sensitive data. These areas imperatively demand privacy-enhancing technologies that allow responsible, ethical and privacy-respecting use of data in potentially hostile environments. To this end, the cryptographic community has addressed these concerns with so-called privacy-preserving computation techniques, a set of tools that enable privacy-enhancing protocols where cleartext access to information is no longer tenable. Among these techniques, homomorphic encryption stands out for its ability to perform operations on encrypted data without compromising the confidentiality or privacy of the information. However, despite its promise, it remains a relatively nascent solution with efficiency and usability limitations. Improving the efficiency of homomorphic encryption has been a major challenge in cryptography and, with the improvements, the complexity of the techniques has increased, especially for non-expert users. In this thesis, we address the problem of the complexity of homomorphic encryption when applied to deep learning. We begin by systematizing existing knowledge in the field through an exhaustive analysis of the state of the art in privacy-preserving deep learning, identifying key trends, research gaps and problems associated with current approaches. One of the identified gaps lies in the use of vectorized algorithms with packed homomorphic encryption, a state-of-the-art technique that reduces the cost of homomorphic encryption in complex areas. This thesis exhaustively analyzes the existing algorithms and proposes new ones for deep learning with packed homomorphic encryption, presenting a formal analysis and usage guidelines for their implementation. Parameter selection for homomorphic encryption schemes is another recurring challenge in the literature, since it plays a critical role in determining not only the security of the instantiation but also the precision, performance and degree of security of the scheme. To address this challenge, this thesis proposes an innovative system that combines fuzzy logic with linear programming tasks to produce secure parametrizations from high-level input arguments, without requiring low-level knowledge of the underlying primitives. Finally, this thesis proposes HEFactory, a symbolic execution compiler designed to streamline the production of homomorphic encryption code and its integration with Python. HEFactory is the culmination of the proposals presented in this thesis, providing a unique architecture that layers the challenges associated with homomorphic encryption and produces simplified operations that can be interpreted by low-level libraries. This approach allows HEFactory to significantly reduce total code length, which amounts to an 80% reduction in the programming complexity of deep learning applications that use homomorphic encryption compared with expert-written code, while maintaining equivalent accuracy.
    Doctoral Programme in Computer Science and Technology at Universidad Carlos III de Madrid
    Chair: María Isabel González Vasco. Secretary: David Arroyo Guardeño. Member: Antonis Michala

    A methodology for large-scale identification of related accounts in underground forums

    Underground forums allow users to interact with communities focused on illicit activities. They serve as an entry point for actors interested in deviant and criminal topics. Because of the pseudo-anonymity they provide, they have become improvised marketplaces for trading illegal products and services, including those used to conduct cyberattacks. Thus, these forums are an important data source for threat intelligence analysts and law enforcement. Most forums forbid the use of multiple accounts, since such accounts are mostly used for malicious purposes. Still, it is a common practice. Being able to identify the actor or gang behind multiple accounts allows for proper attribution in online investigations and for the design of intervention mechanisms against illegal activities. Existing solutions for multi-account detection either require ground truth data to conduct supervised classification or rely on manual approaches. In this work, we propose a methodology for the large-scale identification of related accounts in underground forums. These accounts are similar according to the distinctive content posted, and thus are likely to belong to the same actor or group. The methodology applies to various domains and leverages distinctive artefacts and personal information left online by the users. We provide experimental results on a large dataset comprising more than 1.1M user accounts from 15 different forums. We show how this methodology, combined with existing approaches commonly used in social media forensics, can assist with and improve online investigations.
    This work was partially supported by CERN openlab, the CERN Doctoral Student Programme, the Spanish grants ODIO (PID2019-111429RB-C21 and PID2019-111429RB) and the Region of Madrid grant CYNAMON-CM (P2018/TCS-4566), co-financed by European Structural Funds ESF and FEDER, and Excellence Program EPUC3M1
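One simple way to picture "related accounts" is to link any two accounts that share a distinctive artefact and then collect the connected groups. The sketch below does this with a union-find structure. It is a hypothetical illustration, not the paper's methodology: the function name, the artefact strings and the assumption that artefacts have already been extracted per account are all invented for the example.

```python
from collections import defaultdict


def group_related_accounts(artefacts_by_account):
    """Return groups of accounts connected through at least one shared artefact.

    `artefacts_by_account` maps an account id to a set of distinctive strings
    (hypothetical examples: an email address, a public key fingerprint).
    """
    parent = {account: account for account in artefacts_by_account}

    def find(a):
        # Follow parent links to the representative, compressing the path.
        while parent[a] != a:
            parent[a] = parent[parent[a]]
            a = parent[a]
        return a

    def union(a, b):
        parent[find(a)] = find(b)

    first_owner = {}  # artefact -> first account seen using it
    for account, artefacts in artefacts_by_account.items():
        for artefact in artefacts:
            if artefact in first_owner:
                union(account, first_owner[artefact])
            else:
                first_owner[artefact] = account

    groups = defaultdict(set)
    for account in artefacts_by_account:
        groups[find(account)].add(account)
    # Only groups with more than one account indicate a possible link.
    return [group for group in groups.values() if len(group) > 1]


accounts = {
    "forumA:alice": {"x@mail.example", "key1"},
    "forumB:al1ce": {"key1"},
    "forumC:bob": {"other"},
    "forumC:al": {"x@mail.example"},
}
related = group_related_accounts(accounts)
```

Here the three accounts that share either `key1` or the email address end up in one group, while `forumC:bob` stays unlinked. A real pipeline would also need to decide which artefacts are distinctive enough to count as evidence.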

    HEFactory: A symbolic execution compiler for privacy-preserving Deep Learning with Homomorphic Encryption

    Homomorphic Encryption (HE) allows computing operations on encrypted data, and it is a potential solution to enable Deep Learning (DL) in privacy-enforcing scenarios (e.g., sending private data to cloud services). However, HE remains a complex technology with multiple challenges that prevent successful application by non-experts. In this work, we present HEFactory, a program compiler that effectively assists in building HE applications in Python for both general-purpose and Deep Learning applications, focusing on non-expert data scientists. HEFactory relies on a layered architecture that deals with challenges such as automatic parameter selection and specific data representation of HE applications. Our benchmarks show that HEFactory substantially lowers the programming complexity (i.e., a reduction of 80% in the number of lines of code) with negligible performance overhead over programs written by experts using native HE frameworks

    Towards automated homomorphic encryption parameter selection with fuzzy logic and linear programming

    Homomorphic Encryption (HE) is a set of powerful properties of certain cryptosystems that allow for privacy-preserving operations over encrypted text. Still, HE is not widespread due to limitations in efficiency and usability. Among the challenges of HE, scheme parametrization (i.e., the selection of appropriate parameters within the algorithms) is a relevant, multi-faceted problem. First, the parametrization needs to comply with a set of properties to guarantee the security of the underlying scheme. Second, parametrization requires a deep understanding of the low-level primitives, since the parameters have conflicting effects on the precision, performance, and security of the scheme. Finally, the circuit to be executed both influences and is influenced by the parametrization. Thus, there is no general optimal selection of parameters; the selection depends on the circuit and the application scenario. Currently, most existing HE frameworks require cryptographers to address these considerations manually, which demands expertise acquired through a steep learning curve. In this paper, we propose a unified solution for the aforementioned challenges. Concretely, we present an expert system combining Fuzzy Logic and Linear Programming. The Fuzzy Logic Modules receive a user selection of high-level priorities for the security, efficiency, and performance of the cryptosystem. Based on these preferences, the expert system generates a Linear Programming Model that obtains optimal combinations of parameters by considering those priorities while preserving a minimum level of security for the cryptosystem. We conduct an extended evaluation showing that the expert system generates optimal parameter selections that respect user preferences without requiring the user to analyze the circuit.
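The security constraint mentioned above can be made concrete with a much simpler rule than the expert system the paper proposes. The sketch below is not the fuzzy-logic and linear-programming approach: it only picks the smallest ring dimension whose coefficient modulus budget fits a circuit's needs. The table values are assumed from the HomomorphicEncryption.org security standard (128-bit classical security, ternary secret) and should be checked against the standard before any real use; the function name is hypothetical.

```python
# Assumed upper bounds on log2(q) per ring dimension n for 128-bit classical
# security (values recalled from the HomomorphicEncryption.org standard;
# verify against the standard before relying on them).
MAX_LOG_Q_128 = {
    1024: 27,
    2048: 54,
    4096: 109,
    8192: 218,
    16384: 438,
    32768: 881,
}


def smallest_ring_dimension(required_log_q):
    """Return the smallest tabulated n whose modulus budget covers the circuit.

    `required_log_q` is the total number of coefficient-modulus bits the
    circuit needs; deeper circuits need more bits and therefore a larger n.
    """
    for n in sorted(MAX_LOG_Q_128):
        if required_log_q <= MAX_LOG_Q_128[n]:
            return n
    raise ValueError("no tabulated ring dimension supports this modulus size")
```

The interaction the abstract describes is visible even here: a deeper circuit raises `required_log_q`, which forces a larger ring dimension and slower operations, so security, performance and the circuit cannot be tuned independently.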

    Towards realistic privacy-preserving deep learning over encrypted medical data

    Cardiovascular disease accounts for a substantial share of the load on healthcare systems. The invisible nature of these pathologies demands solutions that enable remote monitoring and tracking. Deep Learning (DL) has arisen as a solution in many fields, and in healthcare, multiple successful applications exist for image enhancement and health outside hospitals. However, the computational requirements and the need for large-scale datasets limit DL. Thus, computation is often offloaded onto server infrastructure, and various Machine-Learning-as-a-Service (MLaaS) platforms have emerged from this need. These enable heavy computations to run in a cloud infrastructure, usually equipped with high-performance computing servers. Unfortunately, barriers persist in healthcare ecosystems, since sending sensitive data (e.g., medical records or personally identifiable information) to third-party servers raises privacy and security concerns with legal and ethical implications. In the scope of Deep Learning for Healthcare to improve cardiovascular health, Homomorphic Encryption (HE) is a promising tool to enable secure, private, and legal health outside hospitals. Homomorphic Encryption allows for computations over encrypted data, thus preserving the privacy of the processed information. Efficient HE requires structural optimizations to perform the complex computation of the internal layers. One such optimization is Packed Homomorphic Encryption (PHE), which encodes multiple elements in a single ciphertext, allowing for efficient Single Instruction over Multiple Data (SIMD) operations. However, using PHE in DL circuits is not straightforward, and it demands new algorithms and data encodings, which existing literature has not adequately addressed. To fill this gap, in this work, we elaborate on novel algorithms to adapt the linear algebra operations of DL layers to PHE. Concretely, we focus on Convolutional Neural Networks. 
We provide detailed descriptions and insights into the different algorithms and efficient inter-layer data format conversion mechanisms. We formally analyze the complexity of the algorithms in terms of performance metrics and provide guidelines and recommendations for adapting architectures that deal with private data. Furthermore, we confirm the theoretical analysis with practical experimentation. Among other conclusions, we prove that our new algorithms speed up the processing of convolutional layers compared to the existing proposals
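To see why packing changes how linear algebra must be written, the sketch below simulates a packed ciphertext as a plain Python list of slots. This is a plaintext simulation under stated assumptions, with no encryption involved and no claim to match the paper's algorithms: it only uses the three operations a packed scheme typically exposes (slot-wise addition, slot-wise multiplication and cyclic rotation) to compute a dot product with the common rotate-and-sum pattern. All function names are hypothetical.

```python
def rotate(slots, k):
    """Cyclically rotate the slot vector left by k positions."""
    k %= len(slots)
    return slots[k:] + slots[:k]


def add(a, b):
    """Slot-wise addition: one operation acts on every slot at once."""
    return [x + y for x, y in zip(a, b)]


def multiply(a, b):
    """Slot-wise multiplication."""
    return [x * y for x, y in zip(a, b)]


def packed_dot_product(a, b):
    """Dot product via one slot-wise multiply and log2(n) rotate-and-add steps."""
    n = len(a)
    # The halving schedule below assumes a power-of-two slot count.
    assert n == len(b) and n > 0 and n & (n - 1) == 0
    acc = multiply(a, b)
    step = n // 2
    while step >= 1:
        # Fold the vector onto itself, summing slots that are `step` apart.
        acc = add(acc, rotate(acc, step))
        step //= 2
    return acc  # every slot now holds the full dot product


result = packed_dot_product([1, 2, 3, 4], [5, 6, 7, 8])
```

Individual slots cannot be read or indexed directly in this model, so sums across slots must be expressed through rotations. That restriction is why layer operations such as convolutions need dedicated algorithms and data layouts.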